Smart Paper Notes

GPT-3-driven pedagogical agents for training children's curious question-asking skills

Rania Abdelghani, Yen-Hsiang Wang, Xingdi Yuan, Tong Wang, Hélène Sauzéon, Pierre-Yves Oudeyer

Category: Natural Language Processing

2022-11-25

Students' ability to ask curious questions is a crucial skill that improves their learning processes. To train this skill, previous research has used a conversational agent that proposes specific cues to prompt children's curiosity during learning. Despite showing pedagogical efficiency, this method is still limited because it relies on generating these prompts by hand for each educational resource, which can be a very long and costly process. In this context, we leverage advances in the natural language processing field and explore using a large language model (GPT-3) to automate the generation of this agent's curiosity-prompting cues to help children ask more and deeper questions. We then used this study to investigate a different curiosity-prompting behavior for the agent. The study was conducted with 75 students aged between 9 and 10. They interacted with one of three agents: a hand-crafted conversational agent that proposes "closed" manually extracted cues leading to predefined questions, a GPT-3-driven one that proposes the same type of cues, or a GPT-3-driven one that proposes "open" cues that can lead to several possible questions. Results showed similar question-asking performance between children who had the two "closed" agents, but significantly better performance for participants with the "open" agent. Our first results suggest the validity of using GPT-3 to facilitate the implementation of curiosity-stimulating learning technologies. In a second step, we also show that GPT-3 can be efficient in proposing relevant open cues that leave children with more autonomy to express their curiosity.

Translated by Google Translate

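In recent years, large language models (LLMs) have demonstrated impressive prowess in natural language generation. A common practice for improving generation diversity is to sample multiple outputs from the model. However, there is no simple and robust way to select the best output from these stochastic samples. As a case study framed in the context of question generation, we propose two prompt-based approaches for selecting high-quality questions from a set of LLM-generated candidates. Our method works under two constraints: 1) a black-box (non-modifiable) question generation model, and 2) no access to human-annotated references. Both are realistic limitations for real-world deployment of LLMs. With automatic as well as human evaluations, we empirically demonstrate that our approach can effectively select questions of higher quality than greedy generation.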
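The sample-then-select setup described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's method: the abstract does not spell out the two prompt-based approaches, so `toy_score` is a hypothetical placeholder standing in for whatever scorer a black-box LLM would provide, and `select_best` is an assumed helper name.

```python
from typing import Callable, Sequence


def select_best(candidates: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the highest-scoring candidate (the first one wins ties)."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=score)


def toy_score(question: str) -> float:
    # Hypothetical placeholder scorer. In the setting of the abstract this
    # would be a prompt-based judgment obtained from a black-box LLM.
    words = question.lower().rstrip("?").split()
    opener_bonus = 1.0 if words and words[0] in {"why", "how"} else 0.0
    return opener_bonus + len(set(words)) / 10.0


# Candidates that would normally come from sampling the generator several times.
samples = [
    "What is a volcano?",
    "Why do some volcanoes erupt more violently than others?",
    "Is lava hot?",
]
best = select_best(samples, toy_score)
print(best)
```

Because the selector only needs a scoring function, it requires neither access to the generator's weights nor human-written reference questions, which matches the two constraints named in the abstract.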

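Quantization is one of the most widely applied deep neural network (DNN) compression strategies when deploying a trained DNN model on an embedded system or a mobile phone. This is due to its simplicity and its adaptability to a wide range of applications and circumstances, as opposed to specific artificial intelligence (AI) accelerators and compilers, which are often designed only for certain hardware (e.g., Google Coral Edge TPU). With the growing demand for quantization, ensuring the reliability of this strategy is becoming a critical challenge. Traditional testing methods, which gather more and more real data for better assessment, are often impractical because of the large size of the input space and the high similarity between the original DNN and its quantized counterpart. As a result, advanced assessment strategies have become essential. In this paper, we present DiverGet, a search-based testing framework for quantization assessment. DiverGet defines a space of metamorphic relations that simulate naturally occurring distortions on the inputs. It then optimally explores these relations to reveal disagreements among DNNs of different arithmetic precision. We evaluate the performance of DiverGet on state-of-the-art DNNs applied to hyperspectral remote sensing images. We chose remote sensing DNNs because they are increasingly deployed at the edge (e.g., on high-lift drones) in critical domains such as climate change research and astronomy. Our results show that DiverGet successfully challenges the robustness of established quantization techniques against naturally shifted data, and outperforms its most recent competitor, DiffChaser, with a success rate that is (on average) four times higher.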
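The core idea of looking for inputs on which a full-precision model and its quantized counterpart disagree can be sketched as below. This is a toy illustration under stated assumptions, not DiverGet itself: it replaces the search-based exploration with plain enumeration over a fixed list of distortions, uses a two-class linear classifier instead of a DNN, and the helpers `quantize`, `shift`, and `find_disagreements` are hypothetical names introduced here.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def quantize(weights: Matrix, step: float) -> Matrix:
    """Round every weight to the nearest multiple of `step`."""
    return [[round(w / step) * step for w in row] for row in weights]


def predict(weights: Matrix, x: Vector) -> int:
    """Index of the highest-scoring class for a linear classifier."""
    scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return scores.index(max(scores))


def shift(offset: float) -> Callable[[Vector], Vector]:
    """Hypothetical distortion: add a constant offset to every input value."""
    return lambda x: [v + offset for v in x]


def find_disagreements(
    full: Matrix,
    quant: Matrix,
    inputs: List[Vector],
    distortions: List[Tuple[str, Callable[[Vector], Vector]]],
) -> List[Tuple[str, Vector]]:
    """Collect distorted inputs on which the two models predict differently."""
    found = []
    for x in inputs:
        for name, distort in distortions:
            x_t = distort(x)
            if predict(full, x_t) != predict(quant, x_t):
                found.append((name, x_t))
    return found


full = [[0.9, 0.2], [0.6, 0.6]]   # toy full-precision weights
quant = quantize(full, step=0.5)  # coarse quantized copy
distortions = [("shift+0.5", shift(0.5)), ("shift+2.0", shift(2.0))]
disagreements = find_disagreements(full, quant, [[1.0, 0.5]], distortions)
print(disagreements)
```

In this toy case both models agree on the undistorted input and on the small shift, and only the larger shift exposes a divergence. That mirrors why distorted inputs are useful here: the two models are highly similar, so undistorted test data rarely separates them.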

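We consider the problem of synthesizing multi-action human motion sequences of arbitrary length. Existing approaches have mastered motion sequence generation in single-action scenarios, but fail to generalize to multi-action and arbitrary-length sequences. We fill this gap by proposing a novel, efficient approach that leverages the expressiveness of recurrent Transformers and the generative richness of conditional variational autoencoders. The proposed iterative approach can generate smooth and realistic human motion sequences with an arbitrary number of actions and frames, in linear space and time. We train and evaluate the proposed approach on the PROX dataset, which we augment with ground-truth action labels. Experimental evaluation shows significant improvements in FID score and semantic consistency metrics compared to the state of the art.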
